Scene Modeling and Image Mining with a Visual Grammar
Authors
Abstract
Automatic content extraction, classification, and content-based retrieval are highly desired capabilities in intelligent remote sensing databases. Pixel-level processing has been the common choice in both academic and commercial systems. We extend the modeling of remotely sensed imagery to three levels: pixel, region, and scene. Pixel-level features are generated using unsupervised clustering of spectral values, texture features, and ancillary data such as digital elevation models. Region-level features include shape information and statistics of pixel-level feature values. Scene-level features include statistics and spatial relationships of regions. This chapter describes our work on developing a probabilistic visual grammar to reduce the gap between low-level features and high-level user semantics, and to support complex query scenarios that consist of many regions with different feature characteristics. The visual grammar includes automatic identification of region prototypes and modeling of their spatial relationships. The system learns the prototype regions in an image collection using unsupervised clustering, and represents spatial relationships by fuzzy membership functions. It automatically selects significant relationships from training data and builds visual grammar models, which can also be updated through user relevance feedback. A Bayesian framework then classifies scenes automatically based on these models. We demonstrate the system with query scenarios that cannot be expressed by traditional region- or scene-level approaches but for which the visual grammar provides accurate classifications and effective retrieval.
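The abstract mentions representing spatial relationships between regions with fuzzy membership functions. As an illustrative sketch only (the chapter's actual membership functions, region representation, and relationship vocabulary are not reproduced here), a directional relation such as "above" can be graded by how closely the displacement between two region centroids aligns with the vertical axis:

```python
import math

def fuzzy_above(centroid_a, centroid_b):
    """Fuzzy membership for 'region A is above region B'.

    Hypothetical example, not the chapter's exact formulation:
    membership is a squared cosine of the angle between the
    centroid displacement and the vertical axis. Image
    coordinates are assumed, so y grows downward and 'above'
    means a smaller y value.
    """
    dx = centroid_b[0] - centroid_a[0]
    dy = centroid_b[1] - centroid_a[1]  # positive when A is above B
    angle = math.atan2(dx, dy)          # 0 when A is directly above B
    # Membership decays as the displacement tilts away from vertical
    # and is clipped to zero once B is no longer below A.
    return max(0.0, math.cos(angle)) ** 2

# Directly above yields full membership; side-by-side yields none.
print(round(fuzzy_above((50, 10), (50, 90)), 2))  # 1.0
print(round(fuzzy_above((10, 50), (90, 50)), 2))  # 0.0
```

Graded memberships like this, rather than hard predicates, are what allow a Bayesian classifier to weigh partially satisfied spatial configurations when scoring a scene against a learned model.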
Similar Works
Probabilistic Retrieval with a Visual Grammar
We describe a system for content-based retrieval and classification of multispectral images. Our system models images on pixel, region, and scene levels. To reduce the gap between low-level features and high-level user semantics, and to support complex query scenarios that consist of many regions with different feature characteristics, we propose a probabilistic visual grammar that includes autom...
Just Noticeable Difference Estimation Using Visual Saliency in Images
Due to physiological and physical limitations in the brain and the eye, the human visual system (HVS) is unable to perceive changes in a visual signal whose magnitude is below a certain threshold, the so-called just-noticeable distortion (JND) threshold. Visual attention (VA) provides a mechanism for selecting particular aspects of a visual scene so as to reduce the computational loa...
Understanding image concepts using ISTOP model
This paper focuses on recognizing image concepts by introducing the ISTOP model. The model parses images from the scene level down to object parts using a context-sensitive grammar. Since there is a gap between the scene and object levels, this grammar proposes the "Visual Term" level to bridge it. A visual term is a concept level above the object level, representing a few co-occurring object...
Graph-based Visual Saliency Model using Background Color
Visual saliency is a cognitive psychology concept that makes some stimuli of a scene stand out relative to their neighbors and attract our attention. Computing visual saliency is a topic of recent interest. Here, we propose a graph-based method for saliency detection, which contains three stages: pre-processing, initial saliency detection and final saliency detection. The initial saliency map i...
Compressed-Sampling-Based Image Saliency Detection in the Wavelet Domain
When watching natural scenes, an overwhelming amount of information is delivered to the Human Visual System (HVS). The optic nerve is estimated to receive around 10^8 bits of information per second. This large amount of information cannot be processed immediately by our neural system. The visual attention mechanism enables the HVS to spend neural resources efficiently, only on the selected parts of the...